THREAD DISPATCHER FOR MULTI-THREADED COMMUNICATION LIBRARY

This invention relates to thread dispatching in a multi-threaded communication library,
and more particularly relates to efficient dispatching of threads which become runnable by
completion of communication events. The choice of which thread to dispatch is based on the
state of the message passing system to allow for maximum efficiency of the communication
infrastructure.

Background of the Invention:

In order to better understand the background of the subject invention, explanation of
certain terminology is first provided. A term well-known in the art as a symmetric
multi-processor (SMP) refers to an aspect of hardware in a computing system and, more
particularly, relates to the physical layout and design of the processor planar itself. Such multiple
processor units have, as one characteristic, the sharing of global memory as well as equal access
to I/O of the SMP system.

Another term which is commonly associated with modern complex computing systems is
a "thread." The term "thread" in a general sense refers merely to a simple execution path through
application software and the kernel of an operating system executing within the computer. As is
well understood in the art, it is commonplace for multiple such threads to be allowed per single
process image. All threads of a process share the same address space, which allows for efficient
communication and synchronization among the various threads of execution in the process.

A thread standard has now been incorporated into the POSIX standard (1003.1c). Basic
thread management under the POSIX standard is described, for example, in a publication by K.
Robbins and S. Robbins entitled Practical UNIX Programming - A Guide To Concurrency,
Communication and Multi-threading, Prentice Hall PTR (1996).

Another concept which is utilized hereinafter in describing the invention is one of "thread
locks" or "mutexes." It is typical in modern computing systems to include critical sections of
code or shared data structures whose integrity is extremely important to the correct operation of
the system. Locks/mutexes are, in general, devices employed in software (or hardware) to
"serialize" access to these critical sections of code and/or shared data structures.

Two types of locks are often encountered in the art, namely blocking locks and simple or
"spin" locks. Blocking locks are of the form which cause a thread requesting the lock to cease
being runnable, e.g., to go to "sleep" as the term is employed in the art, if the lock is currently
held by another thread. Spin locks, in contrast, do not put waiting threads to "sleep", but rather,
the waiting threads execute a spin loop, and thus repeatedly continue to request the lock until it is
freed by the current thread "owner." Spin locks therefore continue to consume CPU cycles if the
lock the thread is waiting for is owned by a different thread. Blocking locks are typically used
for large critical sections of code or if the operating system kernel must differentiate between
threads requiring data structure read-only capability and threads requiring the capability to
modify the data structure(s).
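
The difference between the two lock types can be illustrated with a minimal Python sketch. It is an illustration only and not part of the disclosed library; the helper names `spin_acquire` and `blocking_acquire` are hypothetical, and Python's `threading.Lock` stands in for a POSIX mutex.

```python
import threading


def spin_acquire(lock):
    """Spin-lock style: poll without sleeping; returns the number of failed attempts."""
    spins = 0
    while not lock.acquire(blocking=False):
        spins += 1  # the caller stays runnable and keeps consuming CPU cycles
    return spins


def blocking_acquire(lock):
    """Blocking-lock style: the caller sleeps until the current owner releases the lock."""
    lock.acquire()
```

In the spin version the waiting thread never leaves the run queue, which is why spinning is usually reserved for short critical sections.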

One other term to note is the concept of code being multithread-safe. Code is considered
to be thread/MP-safe if multiple execution threads contending for the same resource or routine
are serialized such that data integrity is ensured for all threads. One way of effecting this is by
means of the aforementioned locks.

Presently, thread locking employs standard POSIX mutex functions. These standard
POSIX functions include pthread_mutex_lock and pthread_mutex_unlock, which are described, for
example, in the above-referenced publication by K. Robbins & S. Robbins entitled Practical
UNIX Programming - A Guide to Concurrency, Communication and Multi-threading. These
functions are designed to enhance portability of applications running on several operating
systems.

A communication library is a set of functions by which processes (tasks) can send,
receive, and wait for messages to/from each other. A typical communication library provides
means for a receiver of a message to discriminate among possible messages that have been sent.
This is often called "message matching logic."

In a multi-threaded communication library, multiple threads can be waiting for messages
to be received from other tasks. In prior versions of the MPI library available from IBM, when a
message was received, the first thread to be waiting was notified of a waiting message. It awoke
and checked to see if the message was for it. If not, it awakened the next waiting thread, and so
on, until the thread waiting for the specific message was awakened. The extra work in
awakening threads which have no work to do creates inefficiency.

PARALLELIZED MANAGEMENT OF ADVANCED PROGRAM-TO-PROGRAM
COMMUNICATIONS/VM IN A SERVER SUPERSTRUCTURE, IBM Technical Disclosure
Bulletin, Vol. 38, No. 02, Feb. 1995, pp. 319-320, discloses running multiple threads, each
thread being dispatched to handle an incoming message, the number of threads being dependent
on the message rate. All threads are equivalent, and there is no binding of messages to threads.

MULTI-THREAD SEQUENCING IN A SMALL COMPUTER SYSTEM INTERFACE
ENVIRONMENT, IBM Technical Disclosure Bulletin, Vol. 37, No. 09, Sept. 1994, pp. 497-499,
discloses a technique for properly sequencing commands to a multi-threaded hardware device by
annotating each command with a word which indicates which other thread must complete before
this thread can start. In this way, a properly ordered queue of commands can be maintained.

U.S. Patent No. 5,560,029 issued Sep. 24, 1996 to Papadopoulos et al. for DATA
PROCESSING SYSTEM WITH SYNCHRONIZATION COPROCESSOR FOR MULTIPLE
THREADS, discloses a distributed data flow computer, in which the threads are the sequences of
machine instructions which are queued and assigned to any available machine processor without
distinction. The patent focuses especially on handling reads of remote memory, in which a
thread's next instruction is not queued until the remote memory request is satisfied. This
enqueuing is done by hardware, and not assigned to any specific processor.

U.S. Patent No. 5,784,615 issued Jul. 21, 1998 to Lipe et al. for COMPUTER SYSTEM
MESSAGING ARCHITECTURE, discloses a mechanism for passing messages between the
various protection zones in the Windows 95 operating system. In the patent, "thread" is to be
interpreted as a sequence of machine instructions, and not the POSIX thread construct. The focus
of the patent is on providing messaging services between secure and insecure domains of the
operating system, by providing callback functions in the secure domain that can be invoked by a
user in the insecure domain. There is no notion of thread synchronization or special dispatching
techniques, other than a general mention of using a standard semaphore to allow two threads to
cooperate.

U.S. Patent No. 5,758,184 issued May 26, 1998 to Lucovsky et al. for SYSTEM FOR
PERFORMING ASYNCHRONOUS FILE OPERATIONS REQUESTED BY RUNNABLE
THREADS BY PROCESSING COMPLETION MESSAGES WITH DIFFERENT QUEUE
THREAD AND CHECKING FOR COMPLETION BY RUNNABLE THREADS, discloses a
technique for performing multiple simultaneous asynchronous input/output operations in a
computer operating system. The focus of the patent is efficiently handling completion of I/O
operations using threads.

U.S. Patent No. 5,710,923 issued Jan. 20, 1998 to Jennings et al. for METHODS AND
APPARATUS FOR EXCHANGING ACTIVE MESSAGES IN A PARALLEL PROCESSING
COMPUTER SYSTEM, discloses a method for communicating active messages among nodes of
a parallel processing computer system where an active message comprises a pointer to a function
to be invoked at the target when the message arrives at the target with a few parameters from the
message being passed to the function upon arrival.

U.S. Patent No. 5,548,760 issued Aug. 20, 1996 to Healey for MESSAGE HANDLER,
discloses a message handler for passing messages between processes in a single threaded
operating system.

It is typical for a message passing library to provide a reliable transport mechanism for
messages between tasks. To that end, a mechanism known in the art as "flow control" is
incorporated. The flow control mechanism requires state to be maintained both at the sender
and receiver of messages to ensure that reliable transport can occur. If messages are lost in
transit, they are retransmitted by the sender based on the state maintained. The flow control
mechanism bounds the amount of state that needs to be maintained to guarantee the reliability
of message delivery. The bounded state is also sometimes referred to in the art as the flow
control window. The size of the window is referred to in the art as tokens. Tokens are used up
when messages are sent and are freed when the receiver acknowledges them, thus advancing the
window. A critical design aspect for high performance message passing systems is to ensure
that the sending of messages and acknowledgments is tuned such that a sender is not blocked
due to lack of tokens. In a multi-threaded message passing system where several threads are
waiting for messages to arrive and then send acknowledgments for freeing tokens, it is critical
for the message passing system to be able to dispatch the thread that is most likely to minimize
senders being blocked due to lack of tokens. Efficient message passing systems therefore cannot
simply rely on POSIX thread dispatch routines for efficient dispatch, since the state needed to
decide which thread should be dispatched for maximum efficiency is in the message passing
system and not in the POSIX utility functions.
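
The token mechanism described above can be pictured with a small sender-side sketch in Python. It is only a sketch under stated assumptions: the class name `TokenWindow`, its methods, and the choice of keeping unacknowledged messages in a list are all hypothetical and are not taken from any actual flow control implementation.

```python
class TokenWindow:
    """Sender-side flow control window (illustrative names throughout)."""

    def __init__(self, tokens):
        self.tokens = tokens  # free slots remaining in the window
        self.unacked = []     # sent messages retained for possible retransmission

    def try_send(self, msg):
        # A send uses up one token; with no tokens left the sender is blocked.
        if self.tokens == 0:
            return False
        self.tokens -= 1
        self.unacked.append(msg)
        return True

    def acknowledge(self, count=1):
        # An acknowledgment from the receiver frees tokens and advances the window.
        count = min(count, len(self.unacked))
        del self.unacked[:count]
        self.tokens += count
```

A sender whose `try_send` fails stays blocked until some receiving thread runs and acknowledges, which is why the choice of which receiving thread to dispatch affects sender throughput.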

Certain messages in multiprocessor message passing systems are more critical than
others, for example, messages that deal with a distributed lock manager in databases and
file systems. It is more efficient to dispatch threads that process these performance critical
messages before handling other messages. The ability to recognize certain messages as being
more critical and to dispatch the appropriate threads to process them is critical for efficient
message passing systems.

The above examples show how state can be maintained efficiently in the message
passing system to allow controlled thread dispatching for maximum efficiency. Our invention
described in this disclosure details an efficient mechanism by which the messaging system can
control the dispatching of messaging threads to enhance its performance.

Summary of the Invention:

In the present invention, each thread has a thread-specific structure containing a "ready
flag" and a POSIX thread condition variable unique to that thread. Each message is assigned a
"handle." When a thread waits for a message, a thread-specific structure is attached to the
message handle being waited on, and the thread is enqueued, waiting for its condition variable to
be signaled. When a message completes (i.e., arrives, is matched, and is copied into the user
buffer), the message matching logic sets the ready flag to READY, and causes the queue to be
examined. The queue manager scans the queue of waiting threads, and sends a thread awakening
condition signal to one of the threads with its ready flag set to READY. The queue manager can
implement any desired policy, including First-In-First-Out (FIFO), Last-In-First-Out (LIFO), or
some other thread priority scheduling policy. This ensures that the thread which is awakened has
the highest priority message to be processed, and enhances the efficiency of message delivery.
The priority of the message to be processed is computed based on the overall design of the
message passing library, and can include giving priority to flow control messages as described in
the examples given above.

These and other objects will be apparent to one skilled in the art from the following
drawings and detailed description of the invention.

Brief Description of the Drawings:

Fig. 1 depicts one example of a threaded computer environment usable with the present
invention;

Fig. 2 is a representation of a queue for storing a plurality of Thread Queue Elements
(TQE) therein;

Fig. 3 is a representation of a base structure or Pipe_control containing information shared
by all threads;

Figs. 4A and 4B, joined by connectors a-a, form a flowchart of a program of the present
invention for controlling multi-threaded communications;

Fig. 5 is a representation of a message handle structure, with fields for message source,
tag, buffer address, maximum length, a "waited on" flag, and a notify address;

Fig. 6 illustrates the relationship of the elements of the invention at one point in the
operation; and

Fig. 7 is a graph showing the improved performance of multithread communications
using the present invention over the method used under the prior art.

Description of the Preferred Embodiment:

As shown in Fig. 1, a computer environment 100 includes a plurality of computing nodes
102 coupled to one another via a connection 104. As one example, each computing node may
comprise a node of an RS/6000 SP System offered by International Business Machines
Corporation, and connection 104 may be a packet switch network, such as the SP switch or high
performance switch (HPS), also offered by International Business Machines Corporation. Note
again, Fig. 1 is presented by way of example only. The techniques disclosed herein could apply
to any serial program or any multithreaded program running on a single machine in addition to
the multi-processor environment depicted in Fig. 1.

Within environment 100, message packets are passed from a source computing node
(sender) to a receiver computing node (receiver) via packet switch network 104. For example, a
user task 106 of computing unit N may pass a message to a user task 106 of computing unit 1
(receiver). Each user task can directly read data from and write data to an associated adapter 112,
bypassing the overhead normally associated with having the operating system intervene in
communication protocols. Adapter 112 couples computing unit 102 to switch 104. One example
of switch 104 is described in detail in "IBM Parallel System Support Programs For AIX
Administration Guide," Publication No. GC23-3897-02 (1996).

As further explanation, communication between a computing unit and its associated
adapter 112 is, for instance, described by an interface that includes functions, such as, open
communication, close communication, enable route, disable route, return status, and reset
adapter. In one embodiment, the interface comprises a message passing interface (MPI) 110, also
referred to herein as an MPI communication library. The MPI library comprises one example of
a resource for which a lock mechanism in accordance with the present invention may be
employed.

The MPI library is described in greater detail in, for example, an International Business
Machines Corporation publication entitled "IBM Parallel Environment For AIX: MPI
Programming and Subroutine Reference," Version 2, Release 4 (October, 1998), the entirety of
which is hereby incorporated herein by reference.

International Business Machines Corporation's implementation of the MPI library is
described in detail in various additional publications. For example, reference an article in the
IBM Systems Journal entitled "The Communication Software In Parallel Environment Of The
IBM SP2," Vol. 34, No. 2, pp. 205-215 (1995). Further information on communication libraries
is available in a textbook by W. Richard Stevens entitled UNIX Network Programming,
published by Prentice Hall, Inc. (1990). Both of these references are hereby incorporated by
reference in their entirety.

As noted, it is assumed herein that the computing environment comprises a threaded
computer environment so that the user task comprises a threaded user, and the library is a
threaded MPI. A threaded computer environment is today well-known in the industry as one
approach to implementing multi-node distributed processing. A threaded MPI library is available
from International Business Machines Corporation as "IBM Parallel Environment For AIX,"
Version 2, Release 4, IBM Product No. 7565-543 (October, 1998). This threaded MPI
comprises a licensed program product which runs on the AIX system. "AIX" is the IBM version
of the UNIX operating system.

The system of Fig. 1 receives messages on multiple threads, and activates the correct
thread to process the message. This is done by the MPI programs using the POSIX threads
library, and particularly the implementation with the IBM Parallel Environment (PE) and Parallel
System Support Program (PSSP) products.

As part of the invention, at the time a thread is created using standard POSIX calls, a
block of storage specific to that thread, called a Thread Queue Element (TQE), is created. Fig. 2
is a representation of a TQE queue 20 storing such TQEs 10. The queue is represented by
having the forward and back pointers contain addresses of other TQEs 10 in the queue. The TQE
10 has the following fields: Forward Pointer 11, Back Pointer 12, State 13, Identification (ID)
14, Queued flag 15, and Target Thread Condition structure 16. The Forward and Back Pointers
11 and 12 are used to maintain a queue of TQEs 10, using well-known linked-list processing
techniques. The State 13 can be READY or WAITING. A TQE 10 that has State=READY can
be dequeued at any time. A TQE 10 that has State=WAITING can only be dequeued if there are
no TQEs 10 with State=READY. The Queued flag 15 is set to indicate whether the TQE 10 is
part of a TQE queue 20, and the Target Thread Condition 16 is a POSIX thread condition structure
that can be waited on using the standard POSIX thread calls. The ID 14 is the POSIX thread ID
used for additional user information. Initially, a TQE 10 is not enqueued in the queue 20, and
has State=READY.

There is a TQE 10 for each message passing thread. In addition, there is a base structure
25 of Fig. 3, wherein the base structure is called the Pipe_control. The Pipe_control 25 has the
following fields (which are accessible by and common to all threads): TQE_queue_head 26,
TQE_queue_tail 27, TQE_ready_count 28, TQ_key 29, and TQ_mutex 30. The
TQE_queue_head 26 and TQE_queue_tail 27 are pointers to the head and tail, respectively, of
the TQE queue 20. The TQE_ready_count 28 is the count of the number of TQEs in the TQE
queue 20 with State=READY. The TQ_key 29 is a value used to obtain the TQE 10 for the
currently-running thread, and is a well-known part of the thread-specific storage functionality of
POSIX threads. The TQ_mutex 30 is a POSIX mutex, used to serialize access to the Pipe_control
structure 25 and the TQE queue 20. Such serialization is required because the elements in
Pipe_control 25 are accessed and modified by more than one thread.
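
As a rough illustration of the two structures just described, the following Python sketch models a TQE and the Pipe_control as linked records. It is a simplified sketch, not the library's actual layout: the Python field names, the `set_state` helper, and the omission of the TQ_key 29 (thread-specific lookup) and of the per-thread condition structure 16 are assumptions made for brevity. Callers are assumed to hold the mutex while calling any method.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional

READY, WAITING = "READY", "WAITING"


@dataclass(eq=False)
class TQE:
    thread_id: int                    # ID 14
    state: str = READY                # State 13; a new TQE starts out READY
    queued: bool = False              # Queued flag 15
    forward: Optional["TQE"] = None   # Forward Pointer 11
    back: Optional["TQE"] = None      # Back Pointer 12


@dataclass
class PipeControl:
    head: Optional[TQE] = None        # TQE_queue_head 26
    tail: Optional[TQE] = None        # TQE_queue_tail 27
    ready_count: int = 0              # TQE_ready_count 28
    mutex: threading.Lock = field(default_factory=threading.Lock)  # TQ_mutex 30

    def enqueue(self, tqe):
        # Append at the tail using ordinary doubly linked list bookkeeping.
        tqe.back, tqe.forward = self.tail, None
        if self.tail is None:
            self.head = tqe
        else:
            self.tail.forward = tqe
        self.tail = tqe
        tqe.queued = True
        if tqe.state == READY:
            self.ready_count += 1

    def dequeue(self, tqe):
        # Unlink from any position in the queue.
        if tqe.back is None:
            self.head = tqe.forward
        else:
            tqe.back.forward = tqe.forward
        if tqe.forward is None:
            self.tail = tqe.back
        else:
            tqe.forward.back = tqe.back
        tqe.forward = tqe.back = None
        tqe.queued = False
        if tqe.state == READY:
            self.ready_count -= 1

    def set_state(self, tqe, state):
        # Keep the READY count consistent when a queued TQE changes state.
        if tqe.queued and tqe.state != state:
            self.ready_count += 1 if state == READY else -1
        tqe.state = state
```

Keeping the READY count alongside the queue lets a thread test whether any other thread is ready to run without walking the list.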

Figs. 4A and 4B, joined at connectors a-a, form a flowchart of the message processing
logic of the present invention. At 31, a thread wants to receive a message and starts the program.
At 32, a buffer large enough to contain the message being received is allocated. At 34, a handle
is allocated for the message (Fig. 5). The handle contains information such as the buffer address,
the match condition (to be discussed), whether the message has been "waited on," and the
address of a TQE 10 to be "notified" when a message matching the match conditions has been
received and copied into the buffer allocated at 32. Other than the TQE, this handle structure has
been used by PSSP in all prior versions of the MPCI/MPI library. At 36, the handle is enqueued
on an unmatched message queue 40. The list (or queue) 40 is a list of handles for which buffers
have been allocated and match conditions posted, but a message satisfying these conditions has
not yet been received. When the handle is enqueued in 40, the waited-on flag for that entry is set
to 0, and the thread-notify TQE address for that entry is set to NULL, since at this point the user
has not indicated a desire to wait for the message. At 37, the program waits for a message to be
received.

At 37, a user decides to wait on a handle. That is, some thread will wait for a message to
be received that matches the conditions listed in the particular handle passed by the message
passing logic. The TQE 10 for that thread will be obtained (via the Pipe_control 25 TQ_key 29)
and the state 13 set to READY at 38. At 40, the thread gets access to the internal message
passing logic via a call to MPID_lock, which returns when the thread "owns" the message
passing lock. The lock/unlock process is fully discussed in the patent application for SYSTEM
FOR RESOURCE LOCK/UNLOCK CAPABILITY IN MULTITHREADED COMPUTER
ENVIRONMENT by Govindaraju et al., Serial Number 09/139,255 filed 08/25/98 (Attorney
Docket No. P09-98-144), incorporated herein by reference.

At 42, the handle waited-on flag is set to 1, meaning that the message is now being waited
on. At 44, the internal message passing logic (routine) is called. This routine reads any
incoming messages and tries to match them with the match conditions on all the handles in the
unmatched queue. If a message matches, the data is copied into the user's buffer, and the
thread-notify TQE address in the handle (if set) is used to identify the thread to be restarted; the
state of the thread-notify TQE is set to READY, and the count of READY TQEs 28 is updated
in the Pipe_control structure 25.
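
The matching step can be sketched in Python as follows. This is a simplified illustration only: the match condition is assumed to be exact equality on source and tag, the `matched` field and the `deliver` function are hypothetical names, a plain dictionary stands in for the thread-notify TQE, and the update of the READY count in the Pipe_control is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Handle:
    source: int                    # message source to match
    tag: int                       # message tag to match
    max_len: int                   # maximum length the buffer can hold
    buffer: bytearray              # user buffer the message is copied into
    waited_on: int = 0             # 1 once a thread waits on this handle
    notify: Optional[dict] = None  # stand-in for the thread-notify TQE address
    matched: bool = False          # hypothetical completion marker


def deliver(unmatched, source, tag, payload):
    """Match one incoming message against the unmatched handle list."""
    for h in unmatched:
        if h.source == source and h.tag == tag and len(payload) <= h.max_len:
            h.buffer[:len(payload)] = payload  # copy data into the user's buffer
            h.matched = True
            unmatched.remove(h)
            if h.notify is not None:
                h.notify["state"] = "READY"    # mark the waiting thread runnable
            return h
    return None  # no posted handle matches this message
```

A handle whose notify address is still unset is completed in the same way; there is simply no thread to mark READY yet.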

At 46 of Fig. 4B, when the internal message routine returns, the thread checks to see if
the message it was waiting for was one of the messages that was matched. If yes, the handle is
freed at 48, and the thread unlocks the internal message passing logic at 50. The message
reception is now complete, and the thread continues at 52 to do other work.

If at 46, the handle being waited on by this thread was not matched, then at 54, the thread
will prepare to wait. It does this by setting the state=WAITING for its TQE, and putting the
address of its TQE in the handle as the thread-notify address. At 56, a check is made to
determine if the message handle for this thread is matched. If yes, the message is complete, the
handle is freed at 48, the lock is released at 50, and the thread continues on with other work at
52.

If the message handle is not matched at 56, a check is made at 58. At 58, the thread tests
the Pipe_control TQE_ready_count 28 to see if any threads are ready to run, or if its time slice has
expired. Time slicing is well understood by those skilled in the art, and will not be discussed
further. At 60, the thread calls MPID_unlock to unlock the internal message passing routine.
This restarts the first READY TQE in the TQE queue 20 by sending a thread signal to its TQE
target signal condition. At 62, the thread calls the system call "yield," which allows any restarted
threads to be assigned to a processor and do useful work. Once this thread has been given
control back from the operating system, it calls MPID_lock at 64. MPID_lock causes the
enqueueing of the TQE and waiting for a signal to its TQE thread signal condition (Fig. 6).
Thus, this thread will not return from MPID_lock called at 64 until it has been signaled, and it
does not get signaled until it is READY (i.e., has a message matched) or until there are no
READY TQEs. Thus, this thread will sleep until a message arrives that matches the conditions
set, and will not be restarted prematurely, even if it was the first thread to wait for a message.
Once this thread gets control back from MPID_lock, at 66 the thread will call the internal
message passing routine to read messages and try to match them against any posted handle. The
thread then loops back to 56, where it expects to find the message matched and thus finish via
48-52. The MPID_lock/unlock routines are as follows:

MPID_lock:

a) get TQE element for this thread via Pipe_control TQ_key;

b) lock the Pipe_control_mutex lock;

c) enqueue the TQE on the TQE thread queue using standard linked-list management for
the forward and back pointers;

d) while Pipe_control_owner != 0, wait for the TQE Target signal condition. This is the
point at which the thread will wait until a message arrives;

e) claim lock ownership by setting Pipe_control_owner = TQE_id (14);

f) dequeue the TQE, since it no longer is waiting for the lock; and

g) unlock the Pipe_control_mutex lock.

MPID_unlock:

a) lock the Pipe_control_mutex lock;

b) search the TQE queue and find the first TQE with state=READY (or the first element
if there are no READY TQEs);

b1) if the TQE element also contains a priority field, find the highest priority TQE with
state=READY;

c) send a thread condition signal to the Target contained in the TQE selected;

d) relinquish lock ownership by setting Pipe_control_owner = 0; and

e) unlock the Pipe_control_mutex lock.
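
The two routines above can be approximated in Python with `threading.Condition` in place of POSIX condition variables. This is a sketch under several assumptions rather than the library's code: a Python list stands in for the linked TQE queue, the TQE is passed in explicitly instead of being looked up through TQ_key, the wait in step d) is assumed to continue for as long as another thread owns the lock, and the optional priority field is assumed to be numeric with larger values meaning higher priority.

```python
import threading

READY, WAITING = "READY", "WAITING"


class TQE:
    def __init__(self, tid, mutex, priority=0):
        self.tid = tid                            # thread ID, nonzero
        self.state = READY
        self.priority = priority                  # optional field used in step b1
        self.cond = threading.Condition(mutex)    # per-thread target condition


class PipeControl:
    def __init__(self):
        self.mutex = threading.Lock()
        self.queue = []     # TQE queue (a list stands in for the linked list)
        self.owner = 0      # 0 means the message passing lock is free

    def select(self):
        """Steps b/b1 of MPID_unlock: choose which waiting thread to signal."""
        if not self.queue:
            return None
        ready = [t for t in self.queue if t.state == READY]
        if ready:
            # max() returns the earliest entry among equal priorities (FIFO).
            return max(ready, key=lambda t: t.priority)
        return self.queue[0]  # no READY entries: fall back to the first element

    def lock(self, tqe):
        with self.mutex:                 # step b
            self.queue.append(tqe)       # step c
            while self.owner != 0:       # step d: sleep on this thread's condition
                tqe.cond.wait()
            self.owner = tqe.tid         # step e
            self.queue.remove(tqe)       # step f
        # step g: the mutex is released on leaving the with-block

    def unlock(self):
        with self.mutex:                 # step a
            target = self.select()       # steps b and b1
            if target is not None:
                target.cond.notify()     # step c
            self.owner = 0               # step d
        # step e: the mutex is released on leaving the with-block
```

Because each thread sleeps on its own condition variable, the unlocking thread wakes exactly the thread it selected rather than every waiter. Swapping the body of `select` is enough to try a FIFO, LIFO, or priority policy.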

Fig. 7 is a graph showing the improved performance of multithread message
communication using the present invention over the method used under the prior art. The prior
art method is shown by curve 80, and the multithread message communication of the
present invention is shown at 82. It will be understood that the present method results in an
increased bandwidth of about 100%.

While the preferred embodiment of the invention has been illustrated and described
herein, it is to be understood that the invention is not limited to the precise construction herein
disclosed, and the right is reserved to all changes and modifications coming within the scope of
the invention as defined in the appended claims.

Claims:

What is claimed is:

1. A method for efficiently dispatching threads awaiting messages in a multi-threaded
communication library comprising:

preassigning threads to messages to be received;

putting to sleep, those threads whose assigned messages have not been received;

upon receipt of a message, awakening its preassigned thread; and

executing said awakened thread, thereby processing the received message.

2. The method of claim 1 wherein the selection of the thread to be dispatched is based
on its priority as set when the thread is put to sleep.

3. The method of claim 1 wherein said preassigning threads step comprises:

creating a thread-specific structure for each thread, each thread-specific structure
having a ready flag and a condition variable unique to its preassigned thread;

creating a handle for each message to be received; and

having a thread invoke message passing logic for a particular handle, thereby
associating the thread and the message.

4. The method of claim 3 wherein said putting to sleep step comprises:

enqueuing for a received message, a preassigned thread-specific structure into a first
queue;

writing into said handle associated with the message received, an identification of
said thread-specific structure enqueued for the received message, and

placing said thread-specific structure for the received message in the WAIT
condition.

5. The method of claim 4 wherein said awakening step comprises:

completing said received message;

changing the condition of the thread-specific structure for the completed received
structure to the READY condition; and

dequeueing with a queue manager, the next thread-specific structure in said first
queue in the READY condition and sending its thread a thread awakening condition
signal.

6. The method of claim 5 further comprising:

allocating in said preassigning step, buffer space for storing messages to be
received; and

in said putting to sleep step, identifying in said handle the buffer in which the
message associated with the handle is to be stored when it is received.

7. The method of claim 6 wherein said completing said received message comprises
storing said received message in the buffer identified in the associated handle for the
received message.

8. The method of claim 5 wherein said queue manager dequeues the next
thread-specific structure using a First-In-First-Out policy.

9. The method of claim 5 wherein said queue manager dequeues the next
thread-specific structure using a Last-In-First-Out policy.

10. The method of claim 5 wherein said queue manager dequeues the next
thread-specific structure based on a priority value contained in said structure.

11. The method of claim 5 further comprising obtaining a lock for the handle
associated with said received message such that the awakened thread may process only the
received message.

12. The method of claim 11 further comprising releasing said lock after said awakened
thread has processed said received message such that said awakened thread may continue
with other work.

13. A computer program product comprising a computer useable medium having
computer readable program code means therein for efficiently dispatching threads
awaiting messages in a multi-threaded communication library, said computer readable
program code means in said computer program product comprising:

computer readable program code means for preassigning threads to messages to be
received;

computer readable program code means for putting to sleep, those threads whose
assigned messages have not been received;

computer readable program code means for, upon receipt of a message, awakening
its preassigned thread; and

computer readable program code means for executing said awakened thread,
thereby processing the received message.

14. The computer program product of claim 13 wherein the selection of the thread to
be dispatched is based on its priority as set when the thread is put to sleep.

15. The computer program product of claim 13 wherein said computer readable
program code means for preassigning threads comprises:

computer readable program code means for creating a thread-specific structure for
each thread, each thread-specific structure having a ready flag and a condition variable
unique to its preassigned thread;

computer readable program code means for creating a handle for each message to
be received; and

computer readable program code means for having a thread invoke message
passing logic for a particular handle, thereby associating the thread and the message.

16. The computer program product of claim 15 wherein said computer readable
program code means for putting to sleep comprises:

computer readable program code means for enqueuing for a received message, a
preassigned thread-specific structure into a first queue;

computer readable program code means for writing into said handle associated
with the message received, an identification of said thread-specific structure enqueued for
the received message, and

computer readable program code means for placing said thread-specific structure
for the received message in the WAIT condition.



17. The computer program product of claim 16 wherein said computer readable
program code means for awakening comprises:

computer readable program code means for completing said received message;

computer readable program code means for changing the condition of the
thread-specific structure for the completed received structure to the READY condition;
and

computer readable program code means for dequeueing with a queue manager, the
next thread-specific structure in said first queue in the READY condition and sending its
thread a thread awakening condition signal.



18. The computer program product of claim 17 further comprising:

computer readable program code means for allocating in said preassigning step,
buffer space for storing messages to be received; and

said computer readable program code means for putting to sleep includes,
computer readable program code means for identifying in said handle the buffer in which
the message associated with the handle is to be stored when it is received.

19. The computer program product of claim 18 wherein said computer readable
program code means for completing said received message comprises computer readable
program code means for storing said received message in the buffer identified in the
associated handle for the received message.

20. The computer program product of claim 17 wherein said queue manager includes
computer readable program code means for dequeueing the next thread-specific structure
using a First-In-First-Out policy.

21. The computer program product of claim 17 wherein said queue manager includes
computer readable program code means for dequeueing the next thread-specific structure
using a Last-In-First-Out policy.

22. The computer program product of claim 17 wherein said queue manager includes
computer readable program code means for dequeueing the next thread-specific structure
based on a priority value contained in said structure.

23. The computer program product of claim 17 further comprising computer readable
program code means for obtaining a lock for the handle associated with said received
message such that the awakened thread may process only the received message.

24. The computer program product of claim 23 further comprising computer readable
program code means for releasing said lock after said awakened thread has processed said
received message such that said awakened thread may continue with other work.

25. An apparatus for efficiently dispatching threads awaiting messages in a
multi-threaded communication library comprising:

means for preassigning threads to messages to be received;

means for putting to sleep, those threads whose assigned messages have not been
received;

means for, upon receipt of a message, awakening its preassigned thread; and

executing said awakened thread, thereby processing the received message.

26. The apparatus of claim 25 wherein the selection of the thread to be dispatched is
based on its priority as set when the thread is put to sleep.

27. The apparatus of claim 25 wherein said means for preassigning threads comprises:

means for creating a thread-specific structure for each thread, each thread-specific
structure having a ready flag and a condition variable unique to its preassigned thread;

means for creating a handle for each message to be received; and

means for having a thread invoke message passing logic for a particular handle,
thereby associating the thread and the message.

28. The apparatus of claim 27 wherein said means for putting to sleep comprises:

means for enqueuing for a received message, a preassigned thread-specific structure
into a first queue;

means for writing into said handle associated with the message received, an
identification of said thread-specific structure enqueued for the received message, and

means for placing said thread-specific structure for the received message in the
WAIT condition.

29. The apparatus of claim 28 wherein said means for awakening comprises:

means for completing said received message;

means for changing the condition of the thread-specific structure for the completed
received structure to the READY condition; and

means for dequeueing with a queue manager, the next thread-specific structure in
said first queue in the READY condition and sending its thread a thread awakening
condition signal.

30. The apparatus of claim 29 further comprising:

means for allocating in said preassigning step, buffer space for storing messages to
be received; and

in said means for putting to sleep, means for identifying in said handle the buffer
in which the message associated with the handle is to be stored when it is received.

31. The apparatus of claim 30 wherein said means for completing said received
message comprises means for storing said received message in the buffer identified in the 
associated handle for the received message. 

32. The apparatus of claim 29 wherein said queue manager includes means for
dequeueing the next thread-specific structure using a First-In-First-Out policy.

33. The apparatus of claim 29 wherein said queue manager includes means for
dequeueing the next thread-specific structure using a Last-In-First-Out policy.

34. The apparatus of claim 29 wherein said queue manager includes means for
dequeueing the next thread-specific structure based on a priority value contained in said
structure.

35. The apparatus of claim 29 further comprising means for obtaining a lock for the
handle associated with said received message such that the awakened thread may process 
only the received message. 

36. The apparatus of claim 35 further comprising means for releasing said lock after
said awakened thread has processed said received message such that said awakened thread 
may continue with other work. 

37. An apparatus comprising:

a data processing system;

a multi-threaded communication library in said data processing system;

a thread dispatcher in said data processing system for efficiently dispatching
threads awaiting messages in said multi-threaded communication library;

computer code which preassigns threads to messages to be received;

computer code which puts to sleep those threads whose assigned messages have
not been received;

computer code which, upon receipt of a message, awakens its preassigned thread;
and

computer code which executes said awakened thread, thereby processing the
received message.

38. The apparatus of claim 37 wherein the selection of the thread to be dispatched is
based on its priority as set when the thread is put to sleep. 

39. The apparatus of claim 37 wherein said computer code which preassigns threads
comprises:

computer code which creates a thread-specific structure for each thread, each
thread-specific structure having a ready flag and a condition variable unique to its
preassigned thread;

computer code which creates a handle for each message to be received; and

computer code which causes a thread to invoke message passing logic for a particular
handle, thereby associating the thread and the message.

40. The apparatus of claim 39 wherein said computer code which puts to sleep
comprises:

computer code which enqueues for a received message, a preassigned
thread-specific structure into a first queue;

computer code which writes into said handle associated with the message received,
an identification of said thread-specific structure enqueued for the received message, and

computer code which places said thread-specific structure for the received message
in the WAIT condition.

41. The apparatus of claim 40 wherein said computer code which awakens comprises:

computer code which completes said received message;

computer code which changes the condition of the thread-specific structure for the
completed received structure to the READY condition; and

computer code which dequeues with a queue manager, the next thread-specific
structure in said first queue in the READY condition and sending its thread a thread
awakening condition signal.

42. The apparatus of claim 41 further comprising:

in said computer code which preassigns, computer code which allocates buffer
space for storing messages to be received; and

in said computer code which puts to sleep, computer code which identifies in said handle
the buffer in which the message associated with the handle is to be stored when it is
received.

43. The apparatus of claim 42 wherein said computer code which completes said
received message comprises computer code which stores said received message in the
buffer identified in the associated handle for the received message.

44. The apparatus of claim 41 wherein said queue manager includes computer code
which dequeues the next thread-specific structure using a First-In-First-Out policy.

45. The apparatus of claim 41 wherein said queue manager includes computer code
which dequeues the next thread-specific structure using a Last-In-First-Out policy.

46. The apparatus of claim 41 wherein said queue manager includes computer code
which dequeues the next thread-specific structure based on a priority value contained in
said structure.

47. The apparatus of claim 41 further comprising computer code which obtains a lock
for the handle associated with said received message such that the awakened thread may
process only the received message.

48. The apparatus of claim 47 further comprising computer code which releases said 
lock after said awakened thread has processed said received message such that said 
awakened thread may continue with other work. 
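
By way of illustration only, the lock handling recited in claims 47 and 48 might be sketched as follows. The sketch assumes one mutex per handle; the names `Handle` and `process_message` are hypothetical and do not appear in the disclosure.

```python
import threading


class Handle:
    """Hypothetical per-message handle carrying its own lock."""

    def __init__(self, payload=None):
        self.lock = threading.Lock()
        self.payload = payload
        self.processed = False


def process_message(handle, work):
    """Run `work` on the handle's message while holding the handle's lock."""
    with handle.lock:                 # lock obtained before processing begins
        result = work(handle.payload)
        handle.processed = True
    # The lock is released here, so the thread may continue with other work.
    return result
```

Holding the lock only for the duration of message processing keeps the awakened thread confined to its own message, and releasing it on exit frees the thread for unrelated work.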
THREAD DISPATCHER FOR MULTI-THREADED COMMUNICATION LIBRARY

Abstract of the Disclosure:

Method, computer program product, and apparatus for efficiently dispatching threads in a
multi-threaded communication library which become runnable by completion of an event. Each
thread has a thread-specific structure containing a "ready flag" and a POSIX thread condition
variable unique to that thread. Each message is assigned a "handle". When a thread waits for a
message, its thread-specific structure is attached to the message handle being waited on, and the
thread is enqueued, waiting for its condition variable to be signaled. When a message completes,
the message matching logic sets the ready flag to READY, and causes the queue to be examined.
The queue manager scans the queue of waiting threads, and sends a thread awakening condition
signal to one of the threads with its ready flag set to READY. The queue manager can
implement any desired policy, including First-In-First-Out (FIFO), Last-In-First-Out (LIFO), or
some other thread priority scheduling policy. This ensures that the thread which is awakened has
the highest priority message to be processed, and enhances the efficiency of message delivery.
The priority of the message to be processed is computed based on the overall state of the 
communication subsystem. 
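
By way of illustration only, the dispatching scheme summarized above might be sketched as follows. Python's `threading.Condition` stands in for the POSIX thread condition variable. All class and function names (`ThreadRecord`, `Handle`, `select`, `Dispatcher`) are hypothetical, and the priority policy is assumed to mean "largest priority value first", which the disclosure does not specify.

```python
import threading

WAIT, READY = "WAIT", "READY"


class ThreadRecord:
    """Thread-specific structure: a ready flag plus its own condition variable."""

    def __init__(self, lock, priority=0):
        self.ready = WAIT
        self.priority = priority
        self.awakened = False
        self.cond = threading.Condition(lock)  # unique to the waiting thread


class Handle:
    """Per-message handle; records which thread-specific structure waits on it."""

    def __init__(self):
        self.waiter = None
        self.payload = None
        self.done = False


def select(queue, policy):
    """Pick one READY record from the wait queue according to the policy."""
    ready = [rec for rec in queue if rec.ready == READY]
    if not ready:
        return None
    if policy == "fifo":
        return ready[0]
    if policy == "lifo":
        return ready[-1]
    return max(ready, key=lambda rec: rec.priority)  # assumed priority rule


class Dispatcher:
    def __init__(self, policy="fifo"):
        self.lock = threading.Lock()
        self.queue = []  # waiting records, oldest first
        self.policy = policy

    def wait(self, handle, priority=0):
        """Called by a thread: sleep until the message for `handle` completes."""
        with self.lock:
            if not handle.done:
                rec = ThreadRecord(self.lock, priority)
                handle.waiter = rec
                self.queue.append(rec)
                while not rec.awakened:  # loop guards against spurious wakeups
                    rec.cond.wait()
            return handle.payload

    def complete(self, handle, payload):
        """Called by the message matching logic when a message arrives."""
        with self.lock:
            handle.payload, handle.done = payload, True
            if handle.waiter is not None:
                handle.waiter.ready = READY
            self._dispatch_one()

    def _dispatch_one(self):
        """Queue manager: wake exactly one READY thread, if any."""
        rec = select(self.queue, self.policy)
        if rec is not None:
            self.queue.remove(rec)
            rec.awakened = True
            rec.cond.notify()
```

Because each waiting thread sleeps on its own condition variable, a completion wakes only the thread chosen by the queue manager rather than every waiter. The sketch uses a fixed per-thread priority; it does not attempt to compute priority from the state of the communication subsystem.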